Abstract — RSA is the most widely used public key
cryptosystem, serving the purposes of key exchange for
symmetric key cryptography and of authentication. Improved
network security demands forward secrecy, which RSA is
unable to provide. In addition, the lower per-bit security of
RSA makes it difficult to implement on resource-constrained
devices. Protocols that use Elliptic Curve Diffie-Hellman
Ephemeral (ECDHE) as the key exchange mechanism and RSA for
authentication overcome the drawback of RSA alone and
provide forward secrecy. The advantage of forward secrecy in
a network comes with higher complexity and computational
cost. This paper describes a complete, optimized software
implementation of elliptic curves over the NIST prime field.
The arithmetic operations over the prime field are discussed.
Different coordinate systems for elliptic curve point
representation, namely affine, projective, Jacobian
projective, and mixed coordinate systems, are elaborated.
Various techniques for scalar multiplication, such as the
Binary, NAF, sliding window, fixed-base window, and comb
methods, are given. Scalar multiplication is the dominant
operation in Elliptic Curve Cryptography (ECC), consuming 85%
of the execution time. A controller based on fuzzy logic is
presented for the optimum selection of the window width, w,
in the scalar multiplication methods. A comparison of the
various techniques, and of combinations of techniques, for
performing the complete ECDHE operation is provided along
with the implementation timings.

Index Terms — forward secrecy, coordinate systems, scalar 
multiplication, ECDHE, Fuzzy controller. 

I. Introduction 

The Transport Layer Security (TLS) protocol takes care of
security over a network. It mostly uses either RSA or
Diffie-Hellman (DH) as its key exchange mechanism. RSA is the
more widely used of the two, primarily for the key exchange
in symmetric key cryptography (SKC) communication, as
DH-based key exchange is more expensive. In RSA, one pair of
keys, namely a public key and a private key, is used during
one session, which lasts approximately one month. If this key
pair is compromised in the future, all the information
encrypted during that SKC session can be compromised.
Protection against such a compromise is called forward
secrecy, and RSA is unable to provide it [1].

In other words, forward secrecy means that information which
is secure at present will also remain secure in the future.
The strength of forward secrecy depends on the Discrete
Logarithm Problem (DLP) of the DH key pair [2].

The drawback of high computational cost and complexity of
the DH_RSA approach can be overcome by using elliptic curve
(EC) based DH. Table I shows that Elliptic Curve
Cryptography (ECC) achieves the same level of security as DH
with fewer bits. In the ECDH-Ephemeral (ECDHE) technique, as
shown in Fig. 1, the generated key pair is used for a single
session, usually lasting for a short duration.

Table I. Comparable Key Sizes in Bits [3]

DH     1024   2048   3072   7689
ECC     163    233    283    409


A standard elliptic curve E over the prime field $F_p$, as
used in cryptography, is given by:

$y^2 \bmod p = (x^3 + ax + b) \bmod p$,   (1)

where $a, b \in F_p$ and $(4a^3 + 27b^2) \bmod p \neq 0$ [4].
The points on E are calculated using equation (1). Addition
of two points (Point Addition) and doubling of a point (Point
Doubling) are the basic operations on an EC. The formulae for
point addition and point doubling are shown in Table II.

Table II. EC Mathematical Operations [4]

EC Operation                  Slope (s)                  x_3                y_3
Point Addition                (y_2 - y_1)/(x_2 - x_1)    s^2 - x_1 - x_2    s(x_1 - x_3) - y_1
P(x_1, y_1) + Q(x_2, y_2)
Point Doubling                (3x_1^2 + a)/(2y_1)        s^2 - 2x_1         s(x_1 - x_3) - y_1
2P(x_1, y_1)
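
As an illustration of Table II, the following minimal Python sketch applies the affine formulas on a toy curve. The curve y^2 = x^3 + 2x + 2 over F_17, the point (5, 1), and the names `point_add` and `INF` are assumptions chosen for readability; they are not the parameters or code used in this paper.

```python
# Toy parameters (assumed for illustration): y^2 = x^3 + 2x + 2 over F_17.
P_MOD, A = 17, 2
INF = None  # point at infinity


def point_add(p1, p2):
    """Affine point addition and doubling, following Table II."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    # P + (-P) is the point at infinity (also covers doubling with y = 0).
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF
    if p1 == p2:
        # Point doubling: s = (3*x1^2 + a) / (2*y1)
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        # Point addition: s = (y2 - y1) / (x2 - x1)
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


G = (5, 1)
print(point_add(G, G))                # doubling
print(point_add(G, point_add(G, G)))  # addition
```

Each call costs one field inversion (the `pow(..., -1, P_MOD)` call, available from Python 3.8), which is the cost that the projective coordinate systems discussed later avoid.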



The rest of the paper is organized as follows: Section II
discusses the arithmetic operations over the prime field,
Section III describes the different coordinate systems for
the point addition and point doubling operations, Section IV
introduces various techniques for scalar multiplication,
Section V provides a comparison of the different techniques
along with timing results, and Section VI gives the
conclusion.

II. Prime Field Arithmetic

This section describes the arithmetic operations used in
$F_p$ during the software implementation of EC over the NIST
prime field. A 32-bit processor architecture is assumed
throughout, for uniformity of implementation.

A. Field Representation 

The elements of the prime field $F_p$ are the integers in
[0, P-1]. For the prime P, $m = \lceil \log_2 P \rceil$ is
the number of bits and $t = \lceil m/32 \rceil$ is the size
of the integer (4-byte) array required to store an element of
$F_p$. Prime field elements are stored in an array of size t
as:

$a = (a_{t-1}, a_{t-2}, \dots, a_1, a_0)$
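
The word layout above can be sketched in Python as follows. The helper names `to_words` and `from_words` are hypothetical, and the least-significant-word-first list order is an assumption made for this sketch; the 192-bit NIST prime is used only as an example value of P.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def to_words(value, t):
    """Split an integer into t 32-bit words, least significant word first."""
    return [(value >> (WORD_BITS * i)) & WORD_MASK for i in range(t)]


def from_words(words):
    """Rebuild the integer from its 32-bit words."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))


P192 = 2**192 - 2**64 - 1                 # 192-bit NIST prime
m = P192.bit_length()                     # number of bits
t = (m + WORD_BITS - 1) // WORD_BITS      # ceil(m / 32) words
words = to_words(P192, t)
print(m, t, [hex(w) for w in words])
```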

B. Addition and Subtraction 

The addition operation in the prime field is c = (a + b)
mod P, where a, b, and c are elements of the prime field. If
the sum exceeds P, it is brought back within the prime field.
The modular subtraction operation is performed along the same
lines. Table III shows the steps for the modular addition and
subtraction operations.



Table III. Modular Addition and Modular Subtraction

Step                   Modular Addition         Modular Subtraction
Input                  a, b ∈ [0, P-1]          a, b ∈ [0, P-1]
Output                 C = (a + b) mod P        C = (a - b) mod P
1. C_0 ←               Add(a_0, b_0)            Sub(a_0, b_0)
2. For i: 1 to t-1     C_i ← ADC(a_i, b_i)      C_i ← SBB(a_i, b_i)
3. If carry = 1        C ← C - P                C ← C + P
4. Return              C                        C
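
The steps of Table III can be sketched word by word in Python. This is a minimal sketch under stated assumptions: all function names are hypothetical, words are stored least significant first, and Python integers emulate the ADC and SBB instructions. The extra comparison `C >= P` in `mod_add` is not listed in Table III; it is added here so that sums that exceed P without producing a carry are also reduced.

```python
MASK = 0xFFFFFFFF


def to_words(value, t):
    return [(value >> (32 * i)) & MASK for i in range(t)]


def from_words(words):
    return sum(w << (32 * i) for i, w in enumerate(words))


def add_words(a, b):
    """Word-wise addition with carry propagation (ADD followed by ADC)."""
    c, carry = [], 0
    for ai, bi in zip(a, b):
        s = ai + bi + carry
        c.append(s & MASK)
        carry = s >> 32
    return c, carry


def sub_words(a, b):
    """Word-wise subtraction with borrow propagation (SUB followed by SBB)."""
    c, borrow = [], 0
    for ai, bi in zip(a, b):
        d = ai - bi - borrow
        c.append(d & MASK)
        borrow = 1 if d < 0 else 0
    return c, borrow


def mod_add(a, b, p):
    c, carry = add_words(a, b)
    # Compare most significant word first to test C >= P.
    if carry or c[::-1] >= p[::-1]:
        c, _ = sub_words(c, p)
    return c


def mod_sub(a, b, p):
    c, borrow = sub_words(a, b)
    if borrow:
        c, _ = add_words(c, p)
    return c


P = 2**192 - 2**64 - 1
pw = to_words(P, 6)
print(hex(from_words(mod_add(to_words(P - 1, 6), to_words(5, 6), pw))))
```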



C. Integer Multiplication 

The integer multiplication operation in the prime field is
computationally much more expensive than the addition and
subtraction operations. Algorithm I gives the steps for
multiplication. It uses the 32 x 32-bit mul instruction,
which produces a 64-bit result. Here d_0, d_1, and d_2 are
32-bit variables, a and b are arrays of size t, and C is an
array of 2t words (indices 0 to 2t-1). The result generated
by Algorithm I is 2m bits long and is brought back inside the
prime field (reduced to m bits) with the help of a modular
reduction algorithm [5].

D. Inversion 

Among all the arithmetic operations over the prime field,
inversion is the most expensive. It is approximately 80
times more expensive than integer multiplication. A binary
variant of the Extended Euclidean Algorithm (EEA), shown in
Algorithm II, is used for calculating the inverse of an
element of the prime field.

Algorithm I: Integer Multiplication [5]
Input: a, b ∈ [0, P-1]
Output: C = a * b
1: Initialize variables d_0, d_1, d_2 to zero
2: For k: 0 to 2(t-1) do
3:   For i: 0 to t-1 do
4:     For j: 0 to t-1 do
5:       If (i + j) == k
6:         R ← a_i * b_j, where R is an array of size 2
7:         d_0 ← ADD(d_0, R[0])
8:         d_1 ← ADC(d_1, R[1])
9:         d_2 ← ADC(d_2, 0)
10:    end For
11:  end For
12:  C_k ← d_0, d_0 ← d_1, d_1 ← d_2, d_2 ← 0
13: end For
14: C_(2t-1) ← d_0
15: Return C
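
A minimal Python sketch of the column-wise multiplication of Algorithm I is given below. The function and helper names are hypothetical, Python integers emulate the ADD and ADC instructions, and the two inner loops are merged into a single loop over the pairs with i + j = k, which visits the same products.

```python
MASK = 0xFFFFFFFF


def to_words(value, t):
    return [(value >> (32 * i)) & MASK for i in range(t)]


def from_words(words):
    return sum(w << (32 * i) for i, w in enumerate(words))


def mul_words(a, b):
    """Multiply two t-word integers, producing a 2t-word product."""
    t = len(a)
    c = [0] * (2 * t)
    d0 = d1 = d2 = 0
    for k in range(2 * t - 1):
        # All index pairs (i, j) with i + j == k.
        for i in range(max(0, k - t + 1), min(k, t - 1) + 1):
            r = a[i] * b[k - i]             # 64-bit partial product
            lo, hi = r & MASK, r >> 32      # R[0], R[1]
            s = d0 + lo                     # ADD
            d0, carry = s & MASK, s >> 32
            s = d1 + hi + carry             # ADC
            d1, carry = s & MASK, s >> 32
            d2 = (d2 + carry) & MASK        # ADC with 0
        c[k] = d0
        d0, d1, d2 = d1, d2, 0
    c[2 * t - 1] = d0
    return c


x, y = 2**192 - 2**64 - 2, (0xDEADBEEF << 150) | 12345
product = from_words(mul_words(to_words(x, 6), to_words(y, 6)))
print(product == x * y)
```

The 2t-word result still has to be reduced modulo P, which this sketch leaves out.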



Algorithm II: Binary Inversion Algorithm [5]
Input: a ∈ [1, P-1], prime P
Output: a^(-1) mod P
1: i ← a, j ← P, D ← 1, C ← 0
2: while i ≠ 1 do
3:   while i is even do
4:     i ← i/2; If D is even then D ← D/2
       else D ← (D + P)/2
5:   end while
6:   while j is even do
7:     j ← j/2; If C is even then C ← C/2
       else C ← (C + P)/2
8:   end while
9:   If i > j then i ← i - j, D ← D - C
10:  Else j ← j - i, C ← C - D
11: end while
12: Return D mod P
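
The binary inversion steps can be sketched in Python as follows. The function name `binary_inverse` is hypothetical, and the sketch assumes an odd prime P and an input in [1, P-1]. It relies on the invariants D*a ≡ i and C*a ≡ j (mod P), so that D holds the inverse once i reaches 1.

```python
def binary_inverse(a, p):
    """Return a^(-1) mod p for an odd prime p and 1 <= a <= p - 1."""
    i, j, d, c = a, p, 1, 0
    while i != 1:
        while i % 2 == 0:
            i //= 2
            # Halve d modulo p: add p first when d is odd.
            d = d // 2 if d % 2 == 0 else (d + p) // 2
        while j % 2 == 0:
            j //= 2
            c = c // 2 if c % 2 == 0 else (c + p) // 2
        if i > j:
            i, d = i - j, d - c
        else:
            j, c = j - i, c - d
    return d % p


P = 2**192 - 2**64 - 1
inv = binary_inverse(123456789, P)
print((inv * 123456789) % P)
```

Only shifts, additions, and subtractions are needed, which is why this variant suits processors without a fast division instruction.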



III. Elliptic Curve Cryptography 



The classical Elliptic Curve Diffie Hellman ephemeral 
(ECDHE) scheme is illustrated in Fig. 1 . 



[Figure: key exchange diagram. Recoverable labels: Alice,
Private Key K_A, Private Key K_B, Public Key, Shared Key.]

Figure 1. Elliptic Curve Diffie-Hellman

Initially, both the server and the client nodes agree on a
particular elliptic curve (EC) over some prime field, $F_p$,
with a specific base point, G, termed the generator point. G
is a valid point on the EC with the highest order [4]. Both
nodes set their respective private keys, $K_A$ and $K_B$, by
randomly selecting a scalar integer from the prime field. The
corresponding public keys, $Q_A$ and $Q_B$, are computed by
multiplying the generator point, G, by the corresponding
private keys $K_A$ and $K_B$. These public keys are then
exchanged over the network between the server and the client,
each of which multiplies the received public key by its own
private key, generating the shared secret key
$T = K_A * Q_B = K_B * Q_A$. Due to the Elliptic Curve
Discrete Logarithm Problem (ECDLP), even though the values of
$Q_A$, $Q_B$, and G are sent over the network, it is
computationally infeasible for an intruder to calculate the
private keys $K_A$ and $K_B$ [6].
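
The exchange described above can be sketched end to end on a toy curve. Everything in this sketch is an assumption for illustration: the curve y^2 = x^3 + 2x + 2 over F_17, the base point G = (5, 1) of order 19, and the function names. Private keys are drawn from [1, N-1], where N is the order of G; a real deployment would use a standardized curve such as the 192-bit NIST curve.

```python
import secrets

# Toy domain parameters (assumed): y^2 = x^3 + 2x + 2 over F_17,
# base point G with order N = 19.
P_MOD, A = 17, 2
G, N = (5, 1), 19


def point_add(p1, p2):
    """Affine addition; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)


def scalar_mult(k, point):
    """Right-to-left double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def ecdhe_keypair():
    """Fresh (ephemeral) key pair: private scalar and public point."""
    private = 1 + secrets.randbelow(N - 1)
    return private, scalar_mult(private, G)


k_a, q_a = ecdhe_keypair()          # server side
k_b, q_b = ecdhe_keypair()          # client side
shared_a = scalar_mult(k_a, q_b)    # T = K_A * Q_B
shared_b = scalar_mult(k_b, q_a)    # T = K_B * Q_A
print(shared_a == shared_b)
```

Generating a new key pair for every session is what makes the exchange ephemeral: compromising one private key exposes only that session.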

The two most frequently used operations in the ECDHE key
exchange are scalar multiplication and modular reduction.
Scalar multiplication, whose security rests on the ECDLP,
consumes 85% of the computational cost in ECC [7]. It
consists of multiplying a point on the EC by a scalar integer
k, such that $k \in F_p$. Scalar multiplication is carried
out as a sequence of point addition and point doubling
operations.

Thus the speed of the ECDHE key exchange is directly
determined by the performance of scalar multiplication on the
EC. The efficiency of scalar multiplication can be improved
by using a recoding algorithm for the integer k, described in
Section IV, and by using different coordinate systems for the
point addition and doubling operations, described in
Sections III A and B.

A. Coordinate System 

The point addition (PA) and point doubling (PD) operations,
as shown in Table II, make use of the affine coordinate
system. In the affine representation, points on the EC
consist of x and y coordinates. The number of prime field
arithmetic operations used by the PA and PD operations in the
affine coordinate system is given in Table IV. As discussed
in Section II, the high execution time of the inversion
operation in $F_p$ decreases the efficiency of the PA and PD
operations.

Table IV. Equivalent Prime Field Operations

                            Inversion   Multiplication
Point Addition (P ≠ Q)          1             3
Point Doubling (P = Q)          1             4



B. Projective Coordinate System 

The advantage of representing an EC point in this coordinate
system is that the PA and PD operations do not involve the
inversion operation. This gives it an edge over the affine
coordinate system. Many variations of the projective
coordinate system have been introduced; the most efficient
ones are presented here. The first is the standard projective
system (P), in which a point is represented as (x : y : z),
$z \neq 0$, whose affine coordinate (A) equivalent is
$(x/z, y/z)$, and the equation of the curve (with a = -3) is
$y^2 z = x^3 - 3xz^2 + bz^3$. In the Jacobian projective
system (J), a point is represented as (x : y : z),
$z \neq 0$, whose affine equivalent is $(x/z^2, y/z^3)$, and
the equation of the curve is $y^2 = x^3 - 3xz^4 + bz^6$. In
the Chudnovsky Jacobian coordinate system (C), the Jacobian
point (x : y : z) is represented as
$(x : y : z : z^2 : z^3)$ [5].

The formulae for PA and PD that do not involve the inversion
operation in the different variations of the projective
system can be obtained by substituting the affine equivalent
of the respective point into the formulae of Table II. The
number of prime field operations required by the different
coordinate systems to carry out PA and PD is given in
Table V.

It can be observed from Table V that PD in the Jacobian
coordinate system and PA in the mixed coordinate system
(Jacobian + affine) lead to the minimum number of prime field
operations. This improves the overall efficiency of ECC.



Table V. Operation Count for Elliptic Curve PA and PD
(I = inversion, M = multiplication, S = squaring)

Doubling                Addition                   Mixed
2A → A   1I, 2M, 2S     A + A → A   1I, 2M, 1S     J + A → J   8M, 3S
2P → P   7M, 3S         P + P → P   12M, 2S        J + C → J   11M, 3S
2J → J   4M, 4S         J + J → J   12M, 4S        C + A → C   8M, 3S
2C → C   5M, 4S         C + C → C   11M, 3S







The mathematical steps for calculating PD in the Jacobian
coordinate system,
$2(x_1 : y_1 : z_1) = (x_3 : y_3 : z_3)$, are:

$A = 4x_1 y_1^2$, $B = 8y_1^4$, $C = 3(x_1 - z_1^2)(x_1 + z_1^2)$,
$D = -2A + C^2$, $x_3 = D$, $y_3 = C(A - D) - B$,
$z_3 = 2y_1 z_1$.

The mathematical steps for calculating PA in the mixed
coordinate system,
$(x_1 : y_1 : z_1) + (x_2 : y_2 : 1) = (x_3 : y_3 : z_3)$, are:

$A = x_2 z_1^2$, $B = y_2 z_1^3$, $C = A - x_1$,
$D = B - y_1$, $x_3 = D^2 - (C^3 + 2x_1 C^2)$,
$y_3 = D(x_1 C^2 - x_3) - y_1 C^3$, $z_3 = z_1 C$.
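
The two sets of formulae above translate directly into Python. This is a sketch under stated assumptions: the toy curve y^2 = x^3 - 3x + 5 over F_97 and all function names are chosen for illustration only, and the special cases (point at infinity, P = Q or P = -Q in the addition, y = 0 in the doubling) are not handled. The single inversion occurs in `to_affine`, which is needed only once at the end of a scalar multiplication.

```python
# Toy curve (assumed): y^2 = x^3 - 3x + 5 over F_97, so a = -3.
P_MOD, B = 97, 5


def jacobian_double(pt):
    """2(x1 : y1 : z1) in Jacobian coordinates, for a = -3."""
    x1, y1, z1 = pt
    a = 4 * x1 * y1 * y1 % P_MOD
    b = 8 * pow(y1, 4, P_MOD) % P_MOD
    c = 3 * (x1 - z1 * z1) * (x1 + z1 * z1) % P_MOD
    d = (c * c - 2 * a) % P_MOD
    return (d, (c * (a - d) - b) % P_MOD, 2 * y1 * z1 % P_MOD)


def mixed_add(pt, q):
    """(x1 : y1 : z1) + (x2 : y2 : 1): Jacobian plus affine point."""
    x1, y1, z1 = pt
    x2, y2 = q
    a = x2 * z1 * z1 % P_MOD
    b = y2 * pow(z1, 3, P_MOD) % P_MOD
    c = (a - x1) % P_MOD
    d = (b - y1) % P_MOD
    c2, c3 = c * c % P_MOD, pow(c, 3, P_MOD)
    x3 = (d * d - (c3 + 2 * x1 * c2)) % P_MOD
    y3 = (d * (x1 * c2 - x3) - y1 * c3) % P_MOD
    return (x3, y3, z1 * c % P_MOD)


def to_affine(pt):
    """Convert (x : y : z) to (x / z^2, y / z^3); the only inversion."""
    x, y, z = pt
    zi = pow(z, -1, P_MOD)
    return (x * zi * zi % P_MOD, y * pow(zi, 3, P_MOD) % P_MOD)


# Pick two points with different x by brute force (feasible for a toy field).
points = [(x, y) for x in range(P_MOD) for y in range(1, P_MOD)
          if (y * y - (x ** 3 - 3 * x + B)) % P_MOD == 0]
p_aff, q_aff = points[0], next(pt for pt in points if pt[0] != points[0][0])
p_jac = (p_aff[0], p_aff[1], 1)
print(to_affine(jacobian_double(p_jac)), to_affine(mixed_add(p_jac, q_aff)))
```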

IV. Scalar Multiplication 

Scalar multiplication on an EC is represented as T = k * G,
where T and G are points on the EC and k is an integer in the
prime field. The different scalar multiplication algorithms
recode the integer k to reduce the number of ones in it, that
is, to reduce its Hamming weight (W). This reduces the number
of PA operations required for scalar multiplication, hence
speeding up the operation. The scalar multiplication
algorithms can be classified into two categories depending on
G: unknown point and fixed point.

Scalar multiplication with unknown point: algorithms
belonging to this category are used when G is not known a
priori. This category includes the Binary, NAF, Sliding
window, and NAF sliding window methods.

A. Binary Method 

The Binary method is the simplest and the most
computationally expensive scalar multiplication method [8].
The binary representation of the integer k shows that scalar
multiplication reduces to a sequence of point doubling and
point addition operations over the EC:

$k = \sum_{j=0}^{l-1} K_j 2^j, \quad K_j \in \{0, 1\}$.   (2)

$Q = kG = K_0 G + 2(K_1 G + \dots + 2(K_{l-1} G))$.   (3)

In scalar multiplication, the numbers of point addition (A)
and point doubling (D) operations determine the computational
cost of the different algorithms. The number of 1's in the
binary representation of k is called its Hamming weight (W),
and l is the total number of bits in k. The computational
cost of the Binary method is given by equation (4).

$\text{Cost} = (W - 1)A + (l - 1)D$.   (4)
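
Equations (3) and (4) can be checked with a short Python sketch of the left-to-right Binary method. The function name and the `add` and `double` callbacks are hypothetical; they stand in for the PA and PD routines. In the usage lines, ordinary integers replace curve points (so that k * G is simply k times an integer), which is enough to check the control flow and the operation counts.

```python
def binary_scalar_mult(k, g, add, double):
    """Left-to-right double-and-add; returns (k*g, additions, doublings)."""
    bits = bin(k)[2:]
    q = g                       # the leading bit of k is always 1
    adds = doubles = 0
    for bit in bits[1:]:
        q = double(q)
        doubles += 1
        if bit == "1":
            q = add(q, g)
            adds += 1
    return q, adds, doubles


# Integers stand in for curve points in this usage example.
k = 26                          # binary 11010: l = 5, W = 3
q, adds, doubles = binary_scalar_mult(k, 1, lambda a, b: a + b, lambda a: 2 * a)
print(q, adds, doubles)         # cost (W - 1)A + (l - 1)D
```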

B. NAF Method 

In contrast to the representation of k in the Binary method,
if the representation of k also contains negative digits,
i.e. digits from {-1, 0, 1}, it is called a Binary Signed
Digit Representation (SDR). The non-adjacent form (NAF) is a
specific case of SDR. In NAF, both W and l are kept as small
as possible. The NAF of a positive integer k is given by
equation (5) [8].

$k = \sum_{j=0}^{l-1} K_j 2^j, \quad K_j \in \{-1, 0, 1\}$.   (5)

In the NAF of k, the product of any two consecutive digits
is always zero, i.e. $K_j \cdot K_{j+1} = 0$. The NAF of the
integer k is denoted NAF (k); its length is at most (l + 1),
where l is the length of the binary form of k. Algorithm III
is used to convert the integer k into NAF (k).

Scalar multiplication for NAF (k) is obtained using
equation (3); the only difference is that when the digit -1
appears, G is subtracted from Q. The W of the integer k is
reduced to approximately (l/3) by using NAF (k), and the
number of point doubling operations remains the same as in
the Binary method [8]. The computational cost of scalar
multiplication using NAF (k) is given by equation (6).

$\text{Cost} = \frac{l}{3} A + l D$   (6)

Algorithm III: Binary (k) to NAF (k) [9]
Input: k ∈ [0, P-1]
Output: K = NAF (k)
1: i ← 0
2: while k ≥ 1 do
3:   If k is odd then
4:     K_i ← 2 - (k mod 4), k ← k - K_i
5:   Else K_i ← 0
6:   k ← k/2, i ← i + 1
7: end while
8: Return K
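
Algorithm III and the NAF-based multiplication can be sketched in Python as follows. The function names and the `add`, `double`, and `neg` callbacks are hypothetical stand-ins for PA, PD, and point negation; in the usage lines, ordinary integers replace curve points so that the result can be checked directly.

```python
def naf(k):
    """Return the NAF digits of k, least significant digit first."""
    digits = []
    while k >= 1:
        if k % 2 == 1:
            d = 2 - (k % 4)     # +1 if k = 1 mod 4, -1 if k = 3 mod 4
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits


def naf_scalar_mult(k, g, add, double, neg):
    """Left-to-right scalar multiplication over the NAF digits of k."""
    q = None                    # None marks "nothing accumulated yet"
    for d in reversed(naf(k)):
        if q is not None:
            q = double(q)
        if d == 1:
            q = g if q is None else add(q, g)
        elif d == -1:
            # The leading NAF digit of a positive k is +1, so q is set here.
            q = add(q, neg(g))
    return q


print(naf(26)[::-1])            # most significant digit first
print(naf_scalar_mult(26, 1, lambda a, b: a + b, lambda a: 2 * a, lambda a: -a))
```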



Table VI illustrates different examples of NAF (k). 

Table VI. NAF Form of Integers (1̄ denotes the digit -1)

Decimal          Binary                                    NAF
Representation   Representation                            Representation
26               11010                                     101̄010
1122334455       1000 0101 1100 1010 1110 1101 1110 111    1000 101̄0 01̄01 01̄01̄ 0001̄ 001̄0 0001̄ 001̄




C. Sliding Window Method 

To reduce the computational cost of the Binary and NAF
methods, the digits used for representing k can be extended
beyond the three values {-1, 0, 1} used in NAF. This reduces
the number of point additions. The advantage comes at a cost:
a small number of multiples of G must be pre-computed and
stored in memory, so that they can be added to or subtracted
from Q [8] during multiplication. The memory required to hold
these pre-computed values becomes a constraint. The sliding
window method processes at most w consecutive digits of the
scalar integer k at a time, such that the decimal equivalent
of the digits in the window is always odd. The window width
is not fixed; it can vary from 1 to w depending on the number
of zeros at the LSB end of the window. Algorithm IV presents
scalar multiplication using the sliding window method with
the binary representation of the integer k. Table VII
provides the details for different window widths w. The
computational cost of the binary sliding window method is
shown in Table VIII, where v(w), given in equation (7), is
the average length of a run of zeros between windows [4].

$v(w) = \frac{4}{3} - \frac{(-1)^w}{3 \cdot 2^{w-2}}$   (7)

D. NAF Sliding Window Method 

Algorithm V uses both sliding window method and NAF 
(k). The NAF (k) is computed and the same is given as input 
to this algorithm. 

The combination of the sliding window and NAF methods
reduces the number of pre-computations required compared to
the binary sliding window method. This improves the
efficiency of the algorithm in a system with
Algorithm IV: Binary sliding window for scalar
multiplication [10]
Input: Generator point G, integer k, window w
Output: Q = k*G
1: Calculate [x]G where x = 1, 3, 5, ..., (2^w - 1)
2: Q ← ∞, j ← l - 1, where l is the length of k
3: while j ≥ 0 do
4:   if (K_j == 0)
5:     Q ← [2]Q, N ← 0, j ← j - 1
6:   end if
7:   Else
8:     i ← maximum(j - w + 1, 0)
9:     while K_i == 0 do
10:      i ← i + 1
11:    end while
12:    For d = 1 to (j - i + 1) do
13:      Q ← [2]Q
14:    end For
15:    N ← (K_j ... K_i)_2
16:    j ← i - 1
17:  end else
18:  Q ← Q + [N]G
19: end while
20: Return Q
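
A minimal Python sketch of the binary sliding window method is given below. The function name and the `add` and `double` callbacks are hypothetical stand-ins for PA and PD, and the pre-computation is not included in the returned operation counts. In the usage lines, ordinary integers replace curve points; with k = 2973 and w = 3 the sketch walks through the windows 101, 11, 111, and 1.

```python
def sliding_window_mult(k, g, w, add, double):
    """Binary sliding window; returns (k*g, additions, doublings)."""
    # Pre-compute the odd multiples 1g, 3g, ..., (2^w - 1)g.
    g2 = double(g)
    table = {1: g}
    for x in range(3, 1 << w, 2):
        table[x] = add(table[x - 2], g2)

    bits = bin(k)[2:]           # most significant bit first
    q = None                    # None marks "nothing accumulated yet"
    adds = doubles = 0
    j = 0
    while j < len(bits):
        if bits[j] == "0":
            q = double(q)       # q is set here: the leading bit of k is 1
            doubles += 1
            j += 1
            continue
        # Take at most w bits, then shrink the window until it ends in a 1.
        end = min(j + w, len(bits))
        while bits[end - 1] == "0":
            end -= 1
        n = int(bits[j:end], 2)  # odd window value
        if q is None:
            q = table[n]
        else:
            for _ in range(end - j):
                q = double(q)
                doubles += 1
            q = add(q, table[n])
            adds += 1
        j = end
    return q, adds, doubles


print(sliding_window_mult(2973, 1, 3, lambda a, b: a + b, lambda a: 2 * a))
```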
Table VII. Comparison of Different Window Widths for the Sliding Window Method

Window    Number of          Integer             Number of   Number of
width w   pre-computations   k = 2973            additions   doublings
3         3                  101 11 00 111 01    3           9
5         15                 10111 00 111 01     2           7

Intermediate values:
w = 3: 5G, 10G, 20G, 23G, 46G, 92G, 184G, 368G, 736G, 743G,
       1486G, 2972G, 2973G
w = 5: 23G, 46G, 92G, 184G, 368G, 736G, 743G, 1486G, 2972G,
       2973G

Pre-computation:
w = 3: [3]G, [5]G, [7]G
w = 5: [3]G, [5]G, [7]G, [9]G, ..., [25]G, [27]G, [29]G, [31]G



Algorithm V: NAF Sliding window for Scalar
Multiplication [4]
Input: Generator point G, integer k, window w
Output: Q = k*G
1: Compute NAF(k) with Algorithm III
2: Calculate [x]G, where
   x = {1, 3, 5, ..., 2(2^w - (-1)^w)/3 - 1}
3: j ← l - 1, where l is the length of NAF(k)
4: while j ≥ 0 do
5:   Algorithm IV, Steps 4 to 17
6:   If (N > 0)
7:     Q ← Q + [N]G
8:   Else Q ← Q - [|N|]G
9: end while
10: Return Q

limited memory. 

The computational cost of the NAF and sliding window methods
is given in Table VIII. The computational cost of
Algorithms IV and V depends on the window width, w. An
optimal window width, w, needs to be chosen beforehand in
order to reduce the computational cost.

Table VIII. Computational Cost for Sliding Window Scalar
Multiplication

Method [4]   Number of       Number of        Number of
             Doubling (D)    Addition (A)     Pre-computation
Binary       l               l/(w + v(w))     1D + (2^(w-1) - 1)A
NAF          l               l/(w + v(w))     1D + ((2^w - (-1)^w)/3 - 1)A



Scalar multiplication with fixed point: algorithms in this
category are used when the point G is known a priori. This
category includes the fixed-base window and fixed-base comb
methods.

E. Fixed Base Window Method

As the generator point G is already known, a larger number
of pre-computed multiples of G can be stored in memory.
Therefore, a larger window width w is considered than in the
NAF sliding window method. In this technique, w consecutive
bits of k are combined, and k is represented as
$k = (K_{d-1}, \dots, K_1, K_0)_{2^w}$, where
$d = \lceil l/w \rceil$. The pre-computations required are
$2^{wi} G$, where $0 \le i \le d - 1$. Algorithm VI shows
scalar multiplication using the fixed-base window method.

As observed from Algorithm VI, scalar multiplication itself
consists only of point addition operations, but the number of
PD and PA operations required during the pre-computation is
high. The computational cost of the fixed-base window method
is given by equation (8).

$\text{Cost} = (2^w + d - 3)A$   (8)

Algorithm VI: Scalar Multiplication using the Fixed Base
Window Method [4]
Input: Integer k and window w
Output: Q = k*G
1: Pre-compute the values 2^(wj) G, 0 ≤ j ≤ d - 1
2: Initialize variables t ← ∞, Q ← ∞
3: For i: 2^w - 1 down to 1 do
4:   for each j, if K_j == i then t ← t + 2^(wj) G
5:   Q ← Q + t
6: end For
7: Return Q
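
The fixed-base window method can be sketched in Python as follows. The function name, the `nbits` parameter (the bit length l reserved for k), and the `add` and `double` callbacks are hypothetical; `None` plays the role of the point at infinity. In the usage lines, ordinary integers replace curve points. In a real implementation the table would be computed once, offline, for the fixed point G.

```python
def fixed_window_mult(k, g, w, nbits, add, double):
    """Fixed-base window method; only additions in the main loop."""
    d = -(-nbits // w)                      # d = ceil(l / w)
    # Pre-computation: table[j] = 2^(w*j) * g for 0 <= j <= d - 1.
    table = [g]
    for _ in range(1, d):
        p = table[-1]
        for _ in range(w):
            p = double(p)
        table.append(p)

    # Base-2^w digits K_0 ... K_(d-1) of k.
    digits = [(k >> (w * j)) & ((1 << w) - 1) for j in range(d)]

    q = t = None                            # None = point at infinity
    for i in range((1 << w) - 1, 0, -1):
        for j, kj in enumerate(digits):
            if kj == i:
                t = table[j] if t is None else add(t, table[j])
        if t is not None:
            q = t if q is None else add(q, t)
    return q


print(fixed_window_mult(2973, 1, 4, 12, lambda a, b: a + b, lambda a: 2 * a))
```

Each table entry with digit value i is added into t once and then carried through the remaining i passes of the outer loop, so it contributes i times to Q without any doubling.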

F. Fixed Base Comb Method 

In the fixed-base comb method, the binary representation of
k is padded with (d*w - l) zeros on the left, where l is the
bit length of k and $d = \lceil l/w \rceil$. The padded k is
then split into w bit strings of d consecutive bits each,
written as $k = (K^{w-1}, \dots, K^1, K^0)$. To accelerate
the computation, some multiples of the base point are
pre-computed for all bit strings
$(a_{w-1}, \dots, a_1, a_0)$:

$[a_{w-1}, \dots, a_2, a_1, a_0]G = a_{w-1} 2^{(w-1)d} G + \dots + a_2 2^{2d} G + a_1 2^{d} G + a_0 G$

Algorithm VII shows the comb method for scalar
multiplication.

Algorithm VII: Fixed Base Comb Method for Scalar
Multiplication [4]
Input: Integer k and window w
Output: Q = k*G
1: Pre-compute the values [a_(w-1), ..., a_1, a_0]G
2: Modify integer k as: k = (K^(w-1), ..., K^1, K^0)
3: Let K^j_i represent the i-th bit of K^j
4: Q ← ∞
5: For i: d - 1 down to 0 do
6:   Q ← 2Q
7:   Q ← Q + [K^(w-1)_i, ..., K^1_i, K^0_i]G
8: end For
9: Return Q

This method proves to be very efficient, as it uses only
(d - 1) doublings and the number of pre-computed values is
comparatively small. The computational cost of this method is
given by equation (9).

$\text{Cost} = \left(\frac{2^w - 1}{2^w} d - 1\right)A + (d - 1)D$   (9)
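
The comb method can be sketched in Python as follows. The function name, the `nbits` parameter (the bit length l reserved for k), and the `add` and `double` callbacks are hypothetical; `None` plays the role of the point at infinity, and ordinary integers replace curve points in the usage lines. Bit i of row r of the comb is bit (r*d + i) of k.

```python
def comb_mult(k, g, w, nbits, add, double):
    """Fixed-base comb method with w rows of d bits each."""
    d = -(-nbits // w)                      # d = ceil(l / w)
    # base[r] = 2^(r*d) * g for each of the w rows.
    base = [g]
    for _ in range(1, w):
        p = base[-1]
        for _ in range(d):
            p = double(p)
        base.append(p)
    # table[a] = [a_(w-1), ..., a_1, a_0]G for every non-zero bit string a.
    table = {}
    for a in range(1, 1 << w):
        acc = None
        for r in range(w):
            if (a >> r) & 1:
                acc = base[r] if acc is None else add(acc, base[r])
        table[a] = acc

    q = None                                # None = point at infinity
    for i in range(d - 1, -1, -1):
        if q is not None:
            q = double(q)
        # Column i of the comb: bit i of every row K^r.
        col = 0
        for r in range(w):
            col |= ((k >> (r * d + i)) & 1) << r
        if col:
            q = table[col] if q is None else add(q, table[col])
    return q


print(comb_mult(2973, 1, 3, 12, lambda a, b: a + b, lambda a: 2 * a))
```

With l = 192 and w = 3, d = 64, so the main loop needs at most 63 doublings, which agrees with the PD count reported for the comb method in Table XII.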

V. Results And Comparisons 

In this section, the results obtained from the software
implementation of the various techniques introduced in this
paper are discussed. A Linux platform with the gcc compiler
is used for the implementation. The 192-bit prime field, P,
and the domain parameters, a and b, used for the software
implementation are given in Table IX.

Table IX. Domain Parameters for the 192-bit NIST Prime Field

Parameter   192-bit value (hex)
P_192       fffffffffffffffffffffffffffffffeffffffffffffffff
a           fffffffffffffffffffffffffffffffefffffffffffffffc
b           64210519e59c80e70fa7e9ab72243049feb8deecc146b9b1




The timing results for the arithmetic operations over the
prime field, $F_p$, are given in Table X. It can be observed
that the inversion operation is much more time consuming than
the other operations.

Table X. Timings for Prime Field Operations

Operation        Timing (µs)
Addition         5.44
Subtraction      2.015
Multiplication   17.28
Inverse          575.2




The timing results for the point addition and point doubling
operations in the different coordinate systems are given in
Table XI. Table XI shows that the most efficient
implementation of the point doubling operation is obtained in
the Jacobian coordinate system, and the most efficient
implementation of the point addition operation is obtained in
the Jacobian + affine, namely mixed, coordinate system.

Table XI. Timings for Point Addition and Point Doubling Operations

Coordinate System    Point Addition   Point Doubling
Affine               0.598 ms         0.597 ms
Jacobian                              0.129 ms
Jacobian + Affine    0.132 ms



The timing results for the different scalar multiplication
methods, and their comparison in terms of PA and PD
operations, are given in Table XII. In the comparison, MJ
denotes the mixed Jacobian coordinate system, in which PA
uses the mixed coordinate system and PD uses the Jacobian
coordinate system.

Table XII. Comparison of Different Scalar Multiplication Methods

Method                         PA    PD     Timing (seconds)
Binary (Affine)                91    191    0.063
Binary (MJ)                    91    191    0.0196
NAF (Affine)                   64    191    0.057
NAF (MJ)                       64    191    0.0179
Sliding window (Affine)        36    189    0.050
Sliding window (MJ)            36    189    0.0156
NAF sliding window (Affine)    30    189    0.048
NAF sliding window (MJ)        30    189    0.0152
Fixed Base Comb (Affine)       56    63     0.027
Fixed Base Comb (MJ)           56    63     0.008




VI. Conclusion 

In this work, the prime field arithmetic operations, the
different coordinate systems for EC point operations, and the
different scalar multiplication methods are discussed and the
results are compared. It is observed that the inversion
operation in a prime field has a very high execution time
compared to the other field operations. It follows that point
addition and point doubling operations that do not involve
inversion should be used, so that the overall execution time
is lower. Hence the projective, Jacobian, and Chudnovsky
coordinate systems are discussed. The Jacobian system for
point doubling and the Jacobian plus affine system for point
addition prove to be the most efficient. The scalar
multiplication methods for an unknown point, namely the
Binary, NAF, Sliding window, and NAF sliding window methods,
are discussed. The NAF sliding window method, for a suitable
window width w, outperforms the remaining methods, as it
requires the minimum number of point addition and point
doubling operations. The fixed-base window method and the
fixed-base comb method are also discussed for fixed point
scalar multiplication. The comb method proves to be much more
efficient than the fixed-base window method. The NAF sliding
window method for an unknown point and the comb method for a
fixed point, with the point addition operation in the mixed
coordinate system and the point doubling operation in the
Jacobian coordinate system, lead to an optimized and
efficient implementation of ECDHE.